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The thermodynamical properties of heterogeneous DNA sequences are computed by path integral 
techniques applied to a nonlinear model Hamiltonian. The base pairs relative displacements are 
interpreted as time dependent paths whose amplitudes are consistent with the model potential for the 
hydrogen bonds between complementary strands. The portion of configuration space contributing to 
the partition function is determined, at any temperature, by selecting the ensemble of paths which 
fulfill the second law of thermodynamics. For a short DNA fragment, the denaturation is signaled 
by a succession of peaks in the specific heat plots while the entropy grows continuously versus T. 
Thus, the opening of the double strand with bubble formation appears as a smooth crossover due 
to base pair fluctuation effects which are accounted for by the path integral method. The multistep 
transition is driven by the AT-rich regions of the DNA fragment. The base pairs path ensemble 
shows an enhanced degree of cooperativity at about the same temperatures for which the specific 
heat peaks occur. These findings establish a link between microscopic and macroscopic signatures 
of the transition. The fractions of mean base pair stretchings are computed by varying the AT base 
pairs content and taking some threshold values for the occurrence of the molecule denaturation. 

PACS numbers: 87.14.gk, 87.15.A-, 87.15. Zg, 05.10.-a 



I. Introduction 



Partial separation of the DNA double helix is funda- 
mental in many processes relevant for biological function- 
ing such as transcription and replication of the genetic 
information lj. Also the packing of long DNA strands 
into nucleosomes seems related to the local opening of a 
double helix segment which may provide the key mech- 
anism for loop formation [2]. Gene transcription is pos- 
sible as the hydrogen bonds, linking the pair bases on 
the two complementary strands, can break and expose 
the bases for chemical reaction. The region of open base 
pairs (bps), the transcription bubble, is generally local- 
ized and characterized by large amplitude fluctuations 
known as the breathing of DNA. These observations have 
suggested that the bps hydrogen bonds are intrinsically 
nonlinear [3] thus putting some constraints on the mod- 
eling of the double helix dynamics and strands separa- 
tion, the DNA denaturation. The latter is driven ex- 
perimentally either by increasing the temperature or re- 
ducing the proton concentration in the solvent so that 
the repulsion between negative phosphate groups on the 
two strands is less screened. Also adsorption of DNA 
on a surface affects the denaturation properties, a pro- 
cess widely used in biotechnologies [H [5]. Thermally 
induced bubbles can be several bps long even at about 
room temperature and extend by increasing T, leading 
to the DNA melting once the complete strand separa- 
tion occurs. Such process is made evident by a sharp 
increase in the UV absorbance [5] of the DNA solution 
due to the reduction of both base pairing and stacking 
(along the strand) upon denaturation. In fact substantial 
differences occur in the UV absorption profiles for syn- 
thetical homogeneous and natural heterogeneous DNA: 



while the former denaturates within a narrow tempera- 
ture range, the latter shows multiple steps transitions [7 
according to patterns which depend both on the length 
and on the sequence [SHllj. that is on fraction and spe- 
cific order of the strongly bonded guanine- cyto sine and 
weakly bonded adenine-thymine base pairs. However, as 
a common signature to all different DNA structures, de- 
naturation is a highly cooperative phenomenon involving 
a sizeable number of bps. This follows from the fact that 
the thermal disruption of a specific inter-strand hydrogen 
bond decreases the overlap between it electron orbitals of 
the organic rings in the bases and favors the unstacking 
of intra-strand adjacent bases which, in turn, breaks the 
next hydrogen bond and ultimately opens a bubble in 
the double helix [T^] . The role of cooperativity effects in 
DNA has been recognized since long [T5HTrJ] and intro- 
duced phcnomcnologically in Ising-like two state models 
in which the bps are either closed or open. Such models 
have been applied to represent melting transitions occur- 
ing step by step in heterogeneous DNA fragments [TTlll8| . 
Later on Hamiltonian models, in which the potential en- 
ergy is continuous function of the distance between the 
bases |19j . have proposed a microscopic origin for coop- 
erativity by relating it to the anharmonic character of 
the intra-strand stacking potential [20]. The latter has 
been found responsible for a denaturation with an en- 
tropy jump corresponding to an effective latent heat rem- 
iniscent of a first order phase transition in homogeneous 
DNA [21] , However, no consensus has been reached so 
far regarding the nature of the transition, whether first 
or second order [2"2"H5U] . 

The Peyrard-Bishop-Dauxois (PBD) anharmonic 
model |20j has also proved to be consistent with a 
multistep melting envisaged by experiments in heteroge- 
neous DNA [3T] and with the formation of temporary, 
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sequence dependent openings observed by SI nuclease 
cleavage experiments (32] ■ Instead, some discrepancies 
have been pointed out between the PBD predictions 
and the denaturation curves of specific heterogeneous 
sequences [33 35 indicating that improvements in the 
theoretical modeling are still necessary. Due to the 
huge number of degrees of freedom, fully atomistic 
representations for sizeable segments of DNA require 
prohibitive computational time. Accordingly, several 
mesoscopic models have been developed to account for 
the essential interactions which determine structural 
stability, dynamics and denaturation of the molecule 



chain of N heterogeneous bps as follows 



In a recent work |41j , the imaginary time path integral 
formalism has been applied to the PBD Hamiltonian to 
investigate the occurrence of thermal denaturation in ho- 
mogeneous DNA. The transverse stretchings of the bps 
with respect to the ground state have been treated as 
one-dimensional paths x(t) depending on the imaginary 
time r whose range is set by the inverse temperature jH] . 
A path is defined by a set of Fourier coefficients and a 
single base pair displacement is taken at a specific r,. 
Then, an ensemble {x(ri)} (i = l,iV) represents a con- 
figuration for the DNA molecule made of N bps and, by 
varying the Fourier coefficients, one builds all the possi- 
ble molecule states at a selected temperature. While in 
principle the path integral is obtained by summing over 
all DNA configurations, the model potential poses lower 
and upper bounds on the specific bps elongations which 
naturally restrict the path phase space for the compu- 
tation of the partition function. The method accounts 
for the highly cooperative character of the denaturation 
which appears as a smooth second order transition in 
homopolymer DNA. 

In this paper the path integral formalism |43j is ex- 
tended to heterogeneous DNA and, in particular, to short 
fragments which are both technologically interesting for 
fabrication of DNA chips [U] and theoretically relevant 
due to the enhanced role of fluctuations far from the ther- 
modynamic limit (finite and small N) [45l |46] . Due to 
the direct integration over the bps degrees of freedom, 
the path integral method naturally incorporates fluctu- 
ation effects and seems therefore particularly promising 
in dealing with finite size DNA fragments. The PBD 
Hamiltonian and the generalities of the path integral ap- 
proach are presented in Section II. The thermodynamical 
properties for some specific DNA sequences are discussed 
in Section III together with the computation of the frac- 
tions of open bps versus temperature. Some final remarks 
are made in Section IV. 



II. Theory 

A. Hamiltonian Model 

The PBD Hamiltonian, originally introduced for ho- 
mogeneous DNA [20], is usually extended to represent a 
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Vs(y„,y n -i) + V M (y n ) 



K 



V s (y n ,y n -i) = ^g{vn,Vn-i){y n - y n -i) 2 
9{y n ,yn-i) = 1 + pexp[-a(y„ + 2/n-i)] 
Vjaiyn) = D n (exp(-a n y n ) - l Y > (!) 

where y n is the transverse stretching at the n-th site 
and measures the relative pair mates separation from 
the ground state position. The model is essentially 
one-dimensional as the longitudinal displacements, be- 
ing much smaller than the transverse stretchings, are not 
taken into account [T] . The boundary condition j/ = Vn 
closes the chain into a loop whereas, in the case of an 
open end chain with N + 1 bps, the single particle en- 
ergy for y should be added to Eq. 0. p, is the reduced 
mass of the bases which is assumed identical both for GC 
and AT bps. This is a relevant limitation of the model 
H2] which is mirrored also in the stacking potential 
Vs{yn> y-n-i) whose parameters K and p are independent 
of the type of bases at n and n — 1. In fact, K = pv 2 
with v being the harmonic phonon frequency. 

p (> 0) accounts for the anharmonicity in the stack- 
ing of nearest neighbors bps. When the molecule is 
closed, y n , y n -i "C a -1 , the effective stacking coupling is 
K(l + p). Whenever either y n > a^ 1 or y n _i > a -1 , the 
corresponding hydrogen bond breaks and the electronic 
distribution around the two pair mates is modified. Ac- 
cordingly in Eq. (nj), g(y n ,y n _i) ~ 1 and the effective 
stacking coupling (along each strand) between neighbor- 
ing bases drops to K. Then, also the adjacent base tends 
to open as both bases are less closely packed along their 
respective strands. This is the microscopic origin of the 
cooperative character (emphasized in the Introduction) 
of the interactions which determine the formation of a re- 
gion with open bps. The interplay between anharmonic- 
ity and cooperativity is thus at the heart of the PBD 
model through the form of the stacking potential. 

However the form for Vs(y n ,y n -i) in Eq. ([!]) is not 
unique and other potentials have been proposed which 
also account for the finiteness of the stacking energy at 
large intra-strand base separations [37j . Typical values 
for DNA models with intermediate anharmonicity are 

a —2 ° —1 

taken hereafter, K = 60meVA , a = 0.35A and 
p = 1 [3TJI37]. As the parameters are site independent, 
it follows that the present discussion is neglecting stack- 
ing hetereogeneities [7J . The latter may slightly affect the 
melting temperatures of specific portions of the chain 
although they are not expected to modify the nature of 
the denaturation crossover. The quantitative effects of 
the stacking hetereogeneities are left for next investiga- 
tions. 

Instead, heterogeneity is present in the Morse potential 
VM{yn) which models the hydrogen bond link between 
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Base Pair Stretching (A) 



FIG. 1: (Color online) Morse potential Vm versus base pair 
relative separation. The potential parameters D n (in meV) 
and a n (in A ) are taken for both GC and AT base pairs. 

bases on complementary strands [3H1 EH] • Depth D n and 
width a n of the potential differ for weakly bonded AT bps 
and strongly bonded GC bps. Fig. [I] shows Vm(j/«) for 
the parameters used in the following calculations. While 
hydrogen bonds may vary in a considerable range |50j . 
those in DNA are typically described by taking ener- 
gies per bond of ~ 15 — 25meV [51]. I assume here the 
lower bound taking Dat = 30meV and Dqc = 45meV 
thus accounting for the fact that AT and GC bps have 
two and three bonds respectively. a n sets the spatial 
cutoff beyond which the bps tend to open. The values 
a at = 4.2A and ogc = 5A , ensure that transverse 
stretchings are somewhat stiffer for GC than for AT bps 
although even larger values for ace ar e found in the lit- 
erature [52] . 

In spite of some arbitrariness in the parameters choice, 
the shape of V]\i(y n ) captures the fundamental features 
of the many body interactions at play between the op- 
posite strands. The repulsion of the negatively charged 
phosphate groups is described by the hard core that the 
base pair mates experience by coming too close to each 
other (y n < 0). On the opposite side, when the rela- 
tive separation grows above a given threshold, the pair 
opens and the force between the mates vanishes consis- 
tently with the plateau at the dissociation energy en- 
countered for large y n . However precisely the plateau, 
at about y n > lA in Fig. [I] reveals a drawback of the 
model: when all bps in the chain open, the two strands 
separation can grow in principle to infinite with no fur- 
ther effort as the potential energy is flat [S3]. Thus, the 
PBD Hamiltonian assumes a single chain in a infinite so- 
lution whereas experiments deal with DNA in a solvent 
structure at finite concentration hence, recombination of 
separated strands in solution is possible. Here is a case 
of biomolecule whose structure depends on the strong 



interaction with the environment, a challenge to theoret- 
ical investigation [55] . Specifically, reconciling model to 
experiment requires some restrictions of the configura- 
tion space which have been attempted either by methods 
based on molecular dynamics |56j and Monte-Carlo sim- 
ulations [57j or by truncating the kernel domain in the 
transfer integral method [SS] to prevent the two strands 
from going infinitely apart [59] . In the path integral for- 
malism, proposed in Ref. [H] and briefly outlined in the 
next Subsection, the confinement of the phase space for 
the {y n } is naturally incorporated in the computation 
after imposing a macroscopic constraint to the evolution 
of the system which is driven by the temperature. 

It is also worth pointing out that the lower bound con- 
finement for the {y n }, physically due to the hard core 
potential, ensures that the bps paths are self-avoiding at 
complementary sites along the strands. In fact base pair 
mates do not overlap. However we recognize that the 
system in Eq. ([I]), lacking of the rotational degrees of 
freedom, does not capture the helicoidal structure of the 
molecule which should be realistically embedded in the 
three dimensional space. Accordingly, also self-avoidance 
is only partially considered in our investigation as ex- 
cluded volume effects due to interactions between bub- 
bles and bounded segments in three dimensions are not 
taken into account. This effect has been shown to be rel- 
evant in polymer network theories to drive a sharp denat- 
uration transition at least in homogeneous DNA [33J [M] ■ 
On the other hand, the path integral approach to the 
Hamiltonian in Eq. ([I]) accounts for all bps fluctuations 
at any T and permits to include in the computation the 
two competing tendencies of the system: the energetic 
gain associated to the (bounded) double strands config- 
uration and the entropic gain due to the large number of 
configurations available for open strands. 



B. Path Integral Method 

The idea underlying the path integral method is that 
of mapping the real space model in Eq. ([I]) onto the imag- 
inary time scale. Accordingly, the transverse stretching 
y n is represented by a one dimensional path x(i~i) with 
Tj G [0, 0\ and ft being the inverse temperature: 

y n -> x(n), -> x(t') , r' = n - At , 

n = , N ; i = 1 , N T + 1 . (2) 

Thus, at any given temperature, the finite size system 
of N + 1 bps, is described by N T + 1 (N T = N) paths each 
of them taken at a specific Tj along the time axis. Along 
the DNA strands only adjacent bps stacking interactions 
are considered. Accordingly, n and t' in Eq. ([2| , are first 
neighbors separated by At in the discrete imaginary time 
lattice. 

I am assuming periodic boundary conditions, x(0) — 
x(f3), for all paths analogously to those imposed for the 
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ID finite chain described by Eq. Q . Then, periodic- 
ity ensures that a molecule configuration is given by N T 
paths and the retardation is: At — j3/N T . Further, any 
path x(rj) can be expanded in Fourier series with cutoff 
M F 



x(n) = x a + 



[a m cos(w„ 



sin(cj m Tj) , (3) 



with uj m = 2rmr/f3. Using Eq. ([3| has an impor- 
tant physical interpretation: for any choice of coefficients 
{xo,a m ,b m }, a single configuration {x(tj)} for the DNA 
fragment is built at a given temperature. As such coef- 
ficients can be varied in the phase space, many different 
configurations are possible at the same temperature each 
of them being a copy of the molecule in the ensemble. 
Thus, integration over the path coefficients amounts to 
sample the molecule configuration space and, in turn, to 
account for the possible evolutions of the N bps system 
in going between the time points and /?. 

As the trajectories are closed paths, the path integral 
yields the imaginary time partition function |42| which is 
given by 



A{x} = J* dr[^x{r) 2 + V s (x(r), x(r')) + V M (x(r)) 



v ^ m—1 



(x~) / ' / ""' ^ ,/7 '"' ' 



(4) 



where A{x} is the Euclidean action for the molecule 
in Eq. ([I]) after applying the mapping in Eq. |2]). The 
molecule state {x} corresponds to a specific set of Fourier 
coefficients. In practice, the dr integral is replaced by 
Y]^Z 1 (and x(t) — > x(ri)) which has to be sufficiently 
dense to make the action numerically stable. This poses 
a constraint to the application of the method to very 
short DNA fragments. Hereafter I take N T = I 00 while a 
possible extension of the method to molecules of arbitrary 
length will be mentioned in the Conclusion. 

Tlx is the measure of integration which normalizes the 
free particle action 



1)x exp 



dT—xir) 2 
2 v ' . 



(5) 



and § denotes integration over closed particle trajec- 
tories [50]. Aft is the thermal wavelength whose form in 
general depends on the model whether quantum or clas- 
sical. The latter is appropriate to the occurrence of DNA 
denaturation. Then, the time derivative y n (Eq. dTj) ) 
maps onto the imaginary time derivative x(t) (Eq. (ET), 



the proper replacement being: d/dt — > (vf3)d/dT hence, 

The above mentioned truncation of the configuration 
space is intrinsic to the path integral method as the com- 
putation of Eq. |4 1 requires a cutoff in the Fourier coeffi- 



cients integratiori |CTl 152"] . The latter has to be consistent 
with the physics contained in the model potential. Paths 
x i T i) ~ represent the equilibrium configuration for the 
double helix corresponding to the minimum Vm(^(ti)). 
Then, qualitatively, one may argue that too large coef- 
ficients would produce: i) too negative path amplitudes 
in Eq. Q which arc forbidden by the electrostatic re- 
pulsion between the sugar-phosphate backbones [33]; ii) 
too positive paths which are anyway unphysical as the 
two strands separation has an upper bound. As Fig. [I] 
makes clear, paths associated to AT bps can sample a 
spatial range somewhat broader than paths describing 
GC bps whose bonds are stiffer. Incorporating all these 
requirements it is found that the suitable set of paths 
should be searched among the xfc) <G [x m i n ,x max ], with 
Xmin ~ — 0.2A and x max ~ QA for the temperature win- 
dow hosting denaturation effects. More negative paths 
would make a vanishing contribution to the partition 
function (making the free energy F of the system nu- 
merically unstable) while larger positive paths would not 
affect the free energy derivatives. After setting the frame- 
work, the quantitative determination of the paths config- 
uration space is carried out by imposing the fulfillment 
of the second law of thermodynamics. 

The free energy derivatives, presented in the next Sec- 
tion, are obtained by F = —ft hiZ with Z given in 
Eq. Q. 

Thus Eq. Q is computed, at an initial temperature 
Tj, for a given path ensemble defined (at any Tj) by the 
number of integration points over the Fourier coefficients. 
The path ensemble is temperature dependent. Then, at 
any larger T, the numerical code re-determines the con- 
tribution to Z and calculates the free energy derivatives. 
If, for a given number of integration points, the growing 
entropy constraint is not fulfilled then the size of the path 
ensemble is increased. The procedure is reiterated until a 
minimum number of paths is found such that the entropy 
grows versus T. This method sets the T-dependent size 
of the ensemble, N e f / , whose paths satisfy boundary con- 
ditions and macroscopic physical constraints. These are 
the good paths included in the computation. N e ft is the 
number of different trajectories followed by a single base 
pair stretching in the configuration space. As the proce- 
dure holds for any Tj, the total number of paths contribut- 
ing to the thermodynamics is N T x N e f f whose value sets 
the overall system size. Good numerical convergence has 
been found taking N eff ~ 47000 at Tj = 260K [41] and 
no significant effect arises by further increasing the initial 
size of the path ensemble. 
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III. Denaturation Curves 



In heterogeneous DNA, AT-rich portions of the 
molecule tend to open at lower temperatures than GC- 
rich regions. However openings occurring at lower tem- 
peratures extend also well inside the GC domains indicat- 
ing a role for nonlocal effects in shaping multistep denatu- 
ration patterns |64j . The sequence pattern is particularly 
relevant in relatively short segments made of a few tens 
of bps which is the relevant scale for those transcription 
starting domains where the genes are read. As transcrip- 
tion and other biological phenomena require formation of 
open domains, theoretical modeling faces the questions 
to define when: a) a base pair is open, b) a molecule is 
open. Here I consider the statistical average of the i — th 
base pair elongation as given by 




i i i i i i i i i i i i i 

250 275 300 325 350 375 400 

Temperature (K) 



< x(n) >- 



Z 1 1 Dxxfa) exp[-A{x}] 



(6) 



Eq. ([6| is computed by summing over those good paths 
in the configuration space which fulfill the growing en- 
tropy constraint as described in the previous Section. 
Then a base pair is open if: < xfc) > > Cj where the 
threshold C is an arbitrary parameter at this stage. Fur- 
ther, the fraction of open bps is defined as 



/ 



1 Nr 

J20(<x(n)>-C), 



(7) 



i=l 



where #(•) is the Heaviside step function. Accordingly, 
the size of the local openings is measured by / > while 
a molecule is entirely open if / = 1. This does not im- 
ply that all molecule configurations in the ensemble are 
denaturated: I am assuming that a DNA molecule may 
exist in many different configurations which have to be 
Boltzmann weighted to get the ensemble average of phys- 
ically relevant quantities. If all the averaged elongations 
exceed a given threshold then the two strands separate. 

Somewhat different definitions for / appear in the lit- 
erature [33j [57] with some authors arguing that the UV 
absorption signal does not relate to the mean bps stretch- 
ing hence, the sum in Eq. (17| should be made over the 
statistical averages < 9(x(ri) — C) >■ While this point 
should be investigated in connection with available ex- 
periments for ensembles of short molecules |34) . here the 
focus is rather on the trend of the path integral model 
predictions for a single molecule. 

First I consider a GC-rich molecule with 100 bps whose 
sequence is: 
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GC + 6AT + GC + 22 AT + AGC + AT + AGC 
+ AT + 8GC+ [49- 100] GC (8) 

The index i (Eqs. ([6]), j7j) labels the bps running 
from left (= 1) to right (= 100). As the model de- 
pends on the relative positions between the pair mates, 



FIG. 2: (Color online) Sequence L48(AT30)+GC[49-100] in 
the temperature range which shows denaturation. (a) Num- 
ber of paths (for a single base pair stretching) larger than lA, 
1.5A and 2A; Total Number of paths (N T x N e ff) contribut- 
ing to the partition function. Inset: number of paths (per 
base pair) whose amplitude is larger than lA, 1.5 A and 2A 
respectively, over N T x N e ff. (b) Entropy versus tempera- 
ture, (c) Specific Heat versus temperature, (d) Fractions of 
mean base pair stretchings calculated by Eq. «7J for f = lA 
and £ = 1.5A respectively. Inset: mean base pair stretchings 
at four specific sites. 
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GC following GC cannot be distinguished from GC fol- 
lowing CG. Closer to reality descriptions should include 
16 stacking interactions. The results for the sequence in 
Eq. ^ are summarized in Fig. [2] for a temperature win- 
dow which features all the relevant denaturation effects. 
The numbers of path amplitudes exceeding lA, 1.5A and 
2A respectively are plotted in Fig. [2|a) together with 
N T xN ef f which ranges between ~ 4.7- 10 6 and — 7-10 6 . 
The insets displays the path amplitudes normalized over 
N T x N e f f . All plots generally show a steady but not dra- 
matic increase versus T due to the dominance of strongly 
bounded GC pairs. Some exceptions are however signif- 
icant: at T ~ 350A", N T x N e ff increases by over 2 • 10 s 
paths while two slightly less pronounced enhancements 
are found at T ~ 310A (1.6 • 10 5 paths) and T ~ 375A 
(10 5 paths). The total number of paths contributing to 
Z markedly increases when some groups of bps weaken 
their bonds signaling the interplay between cooperativ- 
ity and denaturation. These features are macroscopically 
seen in the plot of the specific heat (Fig. |2jc)) whereas 
the entropy (Fig.|2jb)) displays small irregularities at the 
same T values and maintains an overall continuous be- 
havior. The complementary microscopic explanation is 
provided by Fig. W[d) which plots Eq. Q for two choices 
of the threshold, £ = lA and ( = 1.5A respectively. 
In fact the fraction of mean stretchings exceeding lA, 
with respect to the double helix equilibrium configura- 
tion, shows somewhat appreciable jumps at about the 
same temperatures given above while the fraction larger 
than 1.5A becomes sizeable above T ~ 325A. Anyway 
/ never reaches the unity for such threshold values. The 
overall pictures emerging from Fig.[2]is that of a continu- 
ous tendency towards denaturation essentially promoted 
by the AT sites whose mean stretchings are generally 
larger than those for the GC pairs: this is made evi- 
dent by the inset in Fig. |2][d) where Eq. ^ is plotted 
for i = 5, 15,25,45. Also note that the i = 15,25 sites 
belong to a wider homogeneous AT region than the i = 5 
site hence the former display larger average elongations 
than the latter. 

Now I take a AT-substrate in the right side of the frag- 
ment keeping the same sequence for the first 48 sites: 



GC + 6AT + GC + 22 AT + AGC + AT + 4GC 
+ AT + 8GC + [49 - 100] AT (9) 

The results for the fragment in Eq. ([9| are shown 
in Fig. [3] The portion of the path configuration space 
sampled by the computation is much larger than in the 
previous case with a strong increase at T ~ 380AT and 
N T X N e ff ~ 18 • 10 6 at T = 390A. The path frac- 
tions exceeding lA, 1.5A and 2A respectively ( inset in 
Fig.|3ja) ) are similar to the previous case. However there 
is now a substantial increase of the absolute path num- 
bers contributing to the denaturation with about three 
to six million path amplitudes broader than lA in the 
upper temperature range. Consistently two more peaks 



appear in the specific heat plot beside the three ones al- 
ready found in Fig.[2jc). Looking at Fig.^d), we see that 
the fraction of mean base pair stretchings larger than lA 
attains the unity at T ~ 318A pretty close to the first 
peak encountered in the specific heat (T ~ 311A). As 
subsequent steps are found in the denaturation pattern 
at larger T it may follow than £ = lA underestimates 
the real threshold for the overall molecule denaturation. 
Or, the ensemble average procedure entering the defini- 
tion of / in Eq. ^ may not fully capture the occurrence 
of the molecule denaturation. While this issue deserves 
further work here we note that the mean path amplitudes 
at specific sites (inset in Fig.[3^d)) are significantly larger 
than those for the previous fragment (inset in Fig. 2]^d)): 
this effect is due to the substrate made of weakly bound 
AT pairs. Even the i = 5 AT site feels the change of 
the substrate (in spite of the distance along the fragment 
backbone) pointing to the importance of nonlocal coop- 
erativity effects. Conversely the GC pairs at the first and 
third site may be viewed as the presence of two defects 
embedded in the AT-rich sequence to the left side. As 
the defects affect their surroundings [55] the i = 5 site 
mean amplitude is somewhat smaller than that of other 
AT sites having homogeneous neighbors. 

Eventually, the role of the AT bps is emphasized in 
Fig. [4] where / is computed for three cases: the AT 
sites content is increased/reduced by eight units (with 
respect to Fig. |3| in the first part of the sequence while 
the [49 - 100] segment is kept fixed. The L48(AT22) 
sequence correspond to the L48AS sequence of Ref.pH] 
which shows a two steps melting transition but a broad 
AT substrate is here attached to the sequence itself. 
Then, no direct comparison is possible. Again / is com- 
puted by taking two threshold values as before. By 
adding (removing) eight AT sites, the temperature value 
such that / attains the unity (for £ = lA) shifts down- 
wards (upwards) by about 10A. A significant increase 
in / (for £ = 1.5 A) is also found at large T for the AT- 
richest sequence. Taking for good the functional form in 
Eq. Q, a qualitative agreement is found with the melt- 
ing profile calculated by Monte Carlo simulations of the 
PBD model [57J where the same definition for / is as- 
sumed. A comparison between the L48(AT22) sequence 
m Fig. gaud the L48AS sequence in Ref . [57] suggests that 
a threshold £ ~ 1-1 A permits to get / = 1 at T ~ 345 A 
in both plots. 



IV. Conclusion 

The temperature driven strands separation in hetero- 
geneous DNA sequences has been studied by applying the 
path integral formalism to the nonlinear Peyrard-Bishop- 
Dauxois Hamiltonian model. Essentially the method con- 
sists in mapping the relative base pairs elongations onto 
the imaginary time scale set by the temperature. A time 
index labels each base pair which is thus described by 
all those paths, computed at tj in the path configuration 
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FIG. 3: (Color online) Sequence L48(AT30)+AT[49-100]. (a) 
Number of paths (for a single base pair stretching) larger 
than lA, 1.5 A and 2A; Total Number of paths (N T x N cff ) 
contributing to the partition function. Inset: fractions of 
paths whose amplitude is larger than lA, 1.5 A and 2A re- 
spectively, (b) Entropy, (c) Specific Heat, (d) Fractions of 
mean base pair stretchings calculated by Eq. (T7J for £ = lA 
and £ = 1.5A respectively. Inset: mean base pair stretchings 
at four specific sites. 




Temperature (K) 



FIG. 4: (Color online) Fractions of mean base pair stretch- 
ings (Eq. Q) larger than lA and 1.5A respectively for three 
sequences : L48(AT30)+AT[49-100], L48(AT22)+AT[49-100] 
and L48(AT38)+AT[49-100]. 



space, which are compatible with the model potential and 
fulfill the macroscopic constraint given by the second law 
of thermodynamics. The computational method requires 
that the entropy has to grow versus temperature but no 
ansatz has been made regarding the shape of the entropy 
curves. The continuity found in the latter is consistent 
with the view that the strand separation is an overall 
smooth crossover similar in this respect to the case of 
homogeneous DNA. The model has been applied to short 
fragments for which chain fluctuation effects are generally 
expected to broaden the transition region PEj . In fact the 
molecule denaturation appears here as a multistep phe- 
nomenon, promoted by the AT-rich regions, whose long 
range effects may gradually extend over the whole frag- 
ment. The denaturation steps are signaled by a few sig- 
nificant enhancements in the number of paths which par- 
ticipate to the partition function although such enhance- 
ments are much less sharp than those previously found 
in homogeneous DNA. These findings are consistent with 
the fact that cooperativity is higher in homopolymers 
than in heteropolymers as, in the latter, different por- 
tions of the chain denaturate at different temperatures. 
The specific heat shows sharp peaks at about the same 
temperatures for which anomalies in the path numbers 
plots occur. Beside a main transition peak at T ~ 350^, 
our DNA sequences display some shoulder peaks whose 
frequency grows with a larger AT base pairs content. 
However some arbitrariness remains in the definition of 
the threshold for the occurrence of the overall molecule 
denaturation and much theoretical work remains to be 
done to unravel this issue. 

The present conclusions regarding the smoothness of 
the denaturation are at variance with previous studies 
of the PBD Hamiltonian [ST] suggesting that denatura- 



8 



tion is a first order thermodynamic transition microscop- 
ically driven by the backbone stiffness parameter both 
in homogeneous and heterogeneous sequences [3T]. In 
fact the latter studies considered somewhat longer frag- 
ments than those I have taken but this should not be 
the source of the discrepancy regarding the character 
of the transition as the smooth crossover persists also 
by increasing the system size. Also some polymer net- 
work analysis based on the Poland-Scheraga model for 
DNA [22 point to a sharp denaturation which should be 
ascribed however to self-avoidance effects for the three 
dimensional molecule rather than to backbone stiffness. 
While the debate is open both inside the PBD Hamil- 
tonian- and the Poland-Scheraga model- research fields, 
the path integral results here presented show that an- 
harmonic stiffness alone should not change the character 
of the transition in heterogeneous DNA. Some improve- 
ments in modeling heterogeneous specific sequences are 
certainly expected by taking stiffness parameters which 
appropriately account for the stacking interactions along 
the molecule backbones. This feature however is not ex- 
pected to modify the nature of the crossover predicted 
by the path integral method. Instead, I feel that a main 
reason of divergence with respect to previous Hamilto- 
nian studies lies in the fact that Eq. Q incorporates 
all the path fluctuations around the ground state of the 
double strand structure. These fluctuations, included in 
the computational method, soften the effect of the en- 
tropic barrier associated to the stiffness and ultimately 
smoothen the crossover between the double strand con- 
figuration and a state with open domains. 

Further, among the mesoscopic models capturing the 
essentials of the complex DNA interactions, the path in- 
tegral method has the advantage to account for a re- 
markable number of molecule configurations in a short 



computational time. Nonetheless some limitations re- 
garding both model and method should be here recog- 
nized with the purpose to be lifted in next investigations. 
First, the path integral in Eq. Q describes a one dimen- 
sional system: extensions to higher dimensionality may 
permit to fully include self-avoiding paths in the com- 
putation. Second, the space-time mapping technique in 
Eq. ([2]) may be modified by removing the correspondence 
between one base pair and one point Ti along the imagi- 
nary time axis. By freeing the time from such constraint, 
each base pair stretching would maintain the full time de- 
pendence and one point in the path configuration space 
would correspond to one molecule whose different con- 
figurations could then be obtained by tuning the time. 
Accordingly the configuration space spanned by the com- 
putation would describe an ensemble of molecules each of 
them existing in an ensemble of different states. In this 
way the length of the molecules in the ensemble would 
become a free parameter thus allowing us to examine 
the denaturation process both for long DNA chains and 
fragments with only a few tens of base pairs. Analysis of 
long sequences may permit to check the role of the path 
fluctuations approaching the thermodynamic limit. On 
the other hand, short fragments are particularly interest- 
ing also in view of the fact that experiments capable to 
detect intermediate states in the melting transition are 
becoming available. 
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